Abstract

We present a novel scheme (“Categorical Basis Functions”, CBF) for object class representation in the brain and
contrast it to the “Chorus of Prototypes” scheme recently proposed by Edelman [4]. The power and flexibility of CBF
are demonstrated in two examples. CBF is then applied to investigate the phenomenon of Categorical Perception, in
particular the finding by Bülthoff et al. [2] of categorization of faces by gender without corresponding Categorical
Perception. Here, CBF makes predictions that can be tested in a psychophysical experiment. Finally, experiments are
suggested to further test CBF.
1 Introduction

Object categorization is a central yet computationally difficult
cognitive task. For instance, visually similar objects can belong to
different classes, and conversely, objects that appear rather different
can belong to the same class. Categorization schemes may be based on
shape similarity (e.g., “human faces”), on conceptual similarity (e.g.,
“chairs”), or on more abstract features (e.g., “Japanese cars”, “green
cars”). What are possible computational mechanisms underlying
categorization in the brain?

Edelman has recently presented an object representation scheme called
“Chorus of Prototypes” (COP) [4] where objects are categorized by their
similarities to reference shapes, or “prototypes”. While this
categorization scheme is of appealing simplicity, the reliance on a
single metric in a global shape space imposes severe limitations on the
kinds of categories that can be represented. We will discuss these
shortcomings and present a more general model of object categorization
along with a computational implementation that demonstrates the scheme’s
capabilities, relate the model to recent psychophysical observations on
categorical perception (CP), and discuss some of the model’s predictions.

2 Chorus of Prototypes (COP) 

In COP, “the stimulus is first projected into a high-dimensional
measurement space, spanned by a bank of [Gaussian] receptive fields.
Second, it is represented by its similarities to reference shapes”
([4], p. 112, caption to Fig. 5.1).

The categorization of novel objects in COP proceeds as follows
(ibid., p. 118):

1. A category label is assigned to each of the training objects
(“reference objects”), for each of which an RBF network is trained to
respond to the object from every viewpoint;

2. a test object is represented by the activity pattern it evokes over
all the output units of the reference object RBF networks (i.e., the
“similarity to reference shapes” above);

3. categorization is performed using the activity pattern and the labels
associated with the output units of the reference object RBF networks.
Categorization procedures explored were winner-take-all, and
k-nearest-neighbor using the training views (this time taking the
prototypes to be not the objects but the object views), i.e., the
centers of individual RBF units in each network, with the class label
in this case based on the label of the majority of the k closest stored
views to the test stimulus.
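
The two decision procedures in step 3 can be sketched as follows. This is only an illustration under stated assumptions, not Edelman's implementation: the function names, the toy activity values, and the two-dimensional "stored views" are hypothetical, and the reference-object RBF networks are assumed to have been trained already.

```python
from collections import Counter

import numpy as np


def winner_take_all(activities, labels):
    """Return the label of the most active reference-object unit."""
    return labels[int(np.argmax(activities))]


def knn_label(stimulus, stored_views, view_labels, k=3):
    """Majority label among the k stored views closest to the stimulus."""
    dists = np.linalg.norm(stored_views - stimulus, axis=1)
    nearest = np.argsort(dists)[:k]
    votes = Counter(view_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical outputs of three reference-object networks and their labels.
labels = ["cat", "dog", "cat"]
activities = np.array([0.2, 0.9, 0.4])
wta = winner_take_all(activities, labels)

# Hypothetical stored views (RBF centers) with the label of their object.
stored_views = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]])
view_labels = ["cat", "cat", "dog", "dog"]
knn = knn_label(np.array([0.2, 0.1]), stored_views, view_labels, k=3)
```

In both procedures the class label is read off prototypes that live in one shape space, which is the property discussed next.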

The appealingly simple design of COP also seems to be its most serious
limitation: While a representation based solely on shape similarities
seems to be suited for the taxonomy of some novel objects (cf. Edelman’s
example of the description of a giraffe as a “cameleopard” [4]), such a
representation appears too impoverished when confronted with objects
that can be described on a variety of levels: A car, for instance, can
look like several other cars (and also unlike many other objects), but
it could also be described as a “cheap” car, a “green”* car, a
“Japanese” car, an “old” car, etc. These are different qualities that
are not simply or naturally summarized by shape similarities to
individual prototypes but nevertheless provide useful information to
classify or discriminate the object in question from other objects of
similar shape. The fact that an object can be described in such abstract
categories, and that this information appears to be used in recognition
and discrimination, as indicated by the findings on categorical
perception (see below), calls for an extension of Chorus to permit the
use of several categorization schemes in parallel, to allow the
representation of an object within the framework of a whole dictionary
of categorization schemes that offers a more natural description of an
object than one global shape space.

While Edelman ([4], p. 244) suggests a refinement of Chorus where
weights are assigned to different dimensions driven by task demands, it
is not clear how this can happen in one global shape space if two
objects can be judged as very similar under one categorization scheme
but as rather different under another (as, for instance, a chili pepper
and a candy apple in terms of color and taste, resp.). Use of different
categorization schemes appears to require reversible temporary warping
of shape space depending on which categorization scheme is to be used,
which runs counter to the notion of one general representational space.

3 A Novel Scheme: Categorical Basis Functions (CBF)

In CBF, the receptive fields of stimulus-coding units in measurement
space are not constrained to lie in any specific class: unlike in COP,
there are no class labels associated with these units. The input
ensemble drives the unsupervised, i.e., task-independent, learning of
receptive fields. The only requirement is that the receptive fields of
these stimulus space-coding units (SSCUs) cover the stimulus space
sufficiently to allow the definition of arbitrary classification schemes
on the stimulus space (in the simplest version, “learning” just consists
in storing all the training examples by allocating an SSCU to each
training stimulus).

These SSCUs in turn serve as inputs to units that are trained on
categorization tasks in a supervised way; in fact, if each training
stimulus is represented by one SSCU, then the network would be identical
to a standard radial basis function (RBF) network. Figure 1 illustrates
the CBF scheme.
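
A minimal sketch of this two-stage arrangement is given below, for the simplest variant in which one SSCU is stored per training stimulus. Everything specific in it is an assumption made for illustration: the two-dimensional toy stimuli stand in for the model's actual feature vectors, the names `sscu_activations` and `train_category_unit` are hypothetical, and the output weights are fitted by ridge-regularized least squares, which is one common way to train an RBF output unit and not necessarily the procedure used for the simulations reported here.

```python
import numpy as np


def sscu_activations(stimuli, centers, sigma):
    """Gaussian response of every SSCU (row of `centers`) to every stimulus."""
    d2 = ((stimuli[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def train_category_unit(activations, targets, ridge=0.1):
    """Output weights of one category unit (ridge term is an assumption)."""
    n = activations.shape[1]
    return np.linalg.solve(activations + ridge * np.eye(n), targets)


rng = np.random.default_rng(0)
# Hypothetical stimuli: class +1 around (0, 0), class -1 around (3, 3).
train = np.vstack([rng.normal(0.0, 0.5, (20, 2)),
                   rng.normal(3.0, 0.5, (20, 2))])
targets = np.concatenate([np.ones(20), -np.ones(20)])

# Stage 1 (task-independent): store one SSCU per training stimulus.
centers = train.copy()
acts = sscu_activations(train, centers, sigma=1.0)

# Stage 2 (supervised): fit a category unit on top of the SSCU layer.
w = train_category_unit(acts, targets)

test = np.array([[0.1, -0.2], [2.9, 3.1]])
outputs = sscu_activations(test, centers, sigma=1.0) @ w
predicted = np.sign(outputs)
```

A second categorization task would reuse `centers` and `acts` unchanged and only fit a new weight vector, which is the sense in which the SSCU layer is task-independent.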

Novel stimuli in this framework evoke a characteristic activation
pattern over the existing categorization units (as well as over the
SSCUs). In fact, CBF can be seen as an extension of COP: instead of a
representation based on similarity in a global shape space alone (as in
“the object looks like xyz”, where x,y,z can be objects for which
individual units have been learned), abstract features, which are the
result of prior category learning, are equally valid for the description
of an object (as in “the object looks expensive/old/pink”). Hence, an
object is not only represented by expressing its similarity to learned
shapes but also by its membership in learned categories, providing a
natural basis for object description.

*The idea of representing color through similarities to prototype
objects seems especially awkward considering that it first requires the
build-up of a library of objects of a certain color with the sole
purpose of allowing object shape to be “averaged out”.

Figure 2: Illustration of the cat/dog stimulus space. The stimulus
space is spanned by six objects, three “cats” and three “dogs”. Our
morphing software [13] allows us to generate 3D objects that are
arbitrary combinations of the six prototypes. The lines show possible
morph directions between two prototypes each, as used in the test set.

In the proof-of-concept implementation described in the following, SSCUs
are identical to the view-tuned units from the model by Riesenhuber and
Poggio [12] (in reality, when objects can appear from different views,
they could also be view-invariant; note that the view-tuned units are
already invariant to changes in scale and position [12]). For
simplicity, the unsupervised learning step is done using k-means, or
just by storing all the training exemplars, but more refined
unsupervised learning schemes, which better reflect the structure of the
input space, such as mixture-of-Gaussians or other probability density
estimation schemes, or learning rules that provide invariance to object
transformations [15], are likely to improve performance. Similarly, the
supervised learning scheme used (Gaussian RBF) can be replaced by more
biologically plausible or more sophisticated algorithms (see
discussion).

3.1 An Example: Cat/Dog Classification 

To illustrate the capabilities of CBF, the following simulation was
performed: We presented the hierarchical object recognition system (up
to the C2 layer) of Riesenhuber & Poggio [12] with 144 randomly selected
morphed animal stimuli, as used in a very recent monkey physiology
experiment [6] (see Fig. 2).

Figure 3: Response of the categorization unit (based on 144 SSCUs,
256 afferents to each SSCU, σ_SSCU = 0.7) along the nine class
boundary-crossing morph lines. All stimuli in the left half of the
plot are “cat” stimuli, all on the right-hand side are “dogs” (the class
boundary is at 0.5). The network was trained to output 1 for a cat and
-1 for a dog stimulus. The thick dashed line shows the average over
all morph lines. The solid horizontal line shows the class boundary
in response space.

A view-tuned model unit was allocated for each training stimulus,
yielding 144 view-tuned units (results were similar if the 144 stimuli
were clustered into 30 units using k-means, see appendix). The activity
patterns over the 144 units to each of the 144 stimuli were used as
inputs to train a Gaussian RBF output unit, using the class labels 1 for
cat and -1 for dog as the desired outputs. The categorization
performance of this unit was then tested with the same test stimuli as
in the physiology experiment (which were not part of the training set).
More precisely, the testing set consisted of the 15 lines through morph
space connecting each of the prototypes, each subdivided into 10
intervals, with the exclusion of the stimulus at the mid-points (which
in the case of lines crossing the class boundary would lie right on the
class boundary, with an undefined label), yielding a total of 126
stimuli. Figure 3 shows the response of the categorization unit to the
stimuli on the category boundary-crossing morph lines, together with the
desired label. A categorization was counted as correct if the sign of
the network output was identical to the sign of the class label.
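
The size of the test set can be checked with a short enumeration. The sketch below assumes that prototypes shared by several morph lines are counted once, which is the reading that reproduces the stated total of 126; the integer coefficient encoding and the helper `is_correct` are illustrative choices, not part of the original simulation code.

```python
from itertools import combinations

N_PROTOTYPES = 6   # assumed ordering: cats are 0-2, dogs are 3-5
STEPS = 10         # each morph line is subdivided into 10 intervals

lines = list(combinations(range(N_PROTOTYPES), 2))

stimuli = set()
for i, j in lines:
    for k in range(STEPS + 1):
        if 2 * k == STEPS:
            continue  # the midpoint of every line is excluded
        # Integer morph coefficients (in tenths) avoid float comparisons.
        coeffs = [0] * N_PROTOTYPES
        coeffs[i] += STEPS - k
        coeffs[j] += k
        stimuli.add(tuple(coeffs))  # shared prototypes are stored once

# Lines that connect a cat prototype with a dog prototype.
crossing = [(i, j) for i, j in lines if (i < 3) != (j < 3)]


def is_correct(output, label):
    """Correct if the output has the same sign as the +1/-1 class label."""
    return output * label > 0


n_lines, n_stimuli, n_crossing = len(lines), len(stimuli), len(crossing)
```

Each line contributes 8 interior points (9 minus the midpoint), giving 15 x 8 = 120, plus the 6 prototypes; the 9 crossing lines are the ones plotted in Fig. 3.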

Performance on the training set was 100% correct, performance on the
test set was 97%, comparable to monkey performance, which was over 90%
[6]. The four categorization errors the model makes lie right at the
class boundary.

3.2 Introduction of parallel categorization schemes

To demonstrate how different classification schemes can be used in
parallel within CBF, we also trained a second network to perform a
different categorization task on the same stimuli. The stimuli were
resorted into three classes, each based on one cat and one dog
prototype. For this categorization task, three category units were
trained (on a training set of 180 animal morphs, taken from training
sets of an ongoing physiology project), each one to respond at a level
of 1 for stimuli belonging to “its” class and a level of -1 for stimuli
from the other two classes.† Each category unit received input from the
same 144 SSCUs as the cat/dog category unit described above.

[Figure 1 labels: stimulus space-covering units (SSCUs), unsupervised training; Riesenhuber & Poggio model of view-tuned units]

Figure 1: Cartoon of the CBF categorization scheme, illustrated with the example domain of cars. Stimulus space-covering units (SSCUs)
are the view-tuned units from the model by Riesenhuber & Poggio [12]. They self-organize to respond to representatives of the stimulus
space so that they “cover” the whole input space, with no explicit information about class boundaries. These units then serve as inputs to
task-related units that are trained in a supervised way to perform the categorization task (e.g., to distinguish American-built cars from imports,
or compacts from sedans etc.). In the proof-of-concept implementation described in this paper, the unsupervised learning stage is done via
k-means clustering, or just by storing all the training exemplars, and the supervised stage consists of an RBF network.

As mentioned, it is an open question how to best perform multi-class
classification. We evaluated two strategies: i) categorization is said
to be correct if the maximally activated category unit corresponds to
the true class (“max” case); ii) categorization is correct if the signs
of the three category units are equal to the correct answer (“sign”
case).
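
The two scoring rules can be written down compactly. The sketch below is illustrative only: the function names and the example response vector are hypothetical, and the "correct answer" for the sign rule is taken to be +1 for the true class's unit and -1 for the other two, following the training targets described above.

```python
import numpy as np


def correct_max(outputs, true_class):
    """'max' rule: the most active category unit must be the true class."""
    return int(np.argmax(outputs)) == true_class


def correct_sign(outputs, true_class):
    """'sign' rule: +1 for the true class's unit, -1 for every other unit."""
    target = -np.ones(len(outputs))
    target[true_class] = 1.0
    return bool(np.all(np.sign(outputs) == target))


# Hypothetical responses of the three category units to a class-0 stimulus.
outputs = np.array([0.6, 0.2, -0.7])
max_ok = correct_max(outputs, 0)
sign_ok = correct_sign(outputs, 0)
```

Under these definitions any response pattern that passes the "sign" rule also passes the "max" rule, so the "sign" score can never exceed the "max" score on the same stimuli.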

Performance on the training set in the “max” as well as in the “sign”
case was 100% correct. On the testing set, performance using the “max”
rule was 74%, whereas the performance for the “sign” rule was 61%
correct, the lower numbers on the test set as compared to the cat/dog
task reflecting the increased difficulty of the three-way
categorization. We are currently training a monkey on the same
categorization task, and it will be very interesting to compare the
animal’s performance on the test set to the model’s performance.

4 Interactions between categorization and discrimination: Categorical Perception

When discriminating objects, we commonly do not only rely 
on simple shape cues but also take more complex features 
into account. For example, we can describe a face in terms 
of its expression, its age, gender etc. to provide additional 
information that can be used to discriminate this face from 
other faces. This suggests that training on categorization 
tasks could be of use also for object discrimination. 

The influence of categories on perception is expected to be especially
strong for stimuli in the vicinity of a class boundary: In the cat/dog
categorization task described in Section 3.1, the goal was to classify
all members of one class the same way, irrespective of their shape.
Hence, when presented with two stimuli from the same class, the
categorization result will ideally not allow discrimination between the
two stimuli. On the other hand, two stimuli from different classes are
labelled differently. Thus, one would expect greater accuracy in
discriminating stimulus pairs from different classes than pairs
belonging to the same class (note that in this paper we are not dealing
with the discrimination process itself; while several mechanisms have
been proposed, such as a representation based directly on the SSCU
activation pattern, or one based on the activity pattern over prototypes
such as view-invariant RBF units [4, 9], in this section we only discuss
how prior training on categorization tasks can provide additional
information to the discrimination process, without regard to how the
latter might be implemented computationally).

This phenomenon, called Categorical Perception [8], where linear changes
in a stimulus dimension are associated with nonlinear perceptual
effects, has been observed in numerous experiments, for instance in
color or phoneme discrimination.

†Multi-class classification is a challenging and as yet unsolved
computational problem; the scheme employed here was chosen for its
simplicity.

A recent experiment by Goldstone [7] investigated Categorical Perception
(CP) in a task involving training subjects on a novel categorization. In
particular, subjects were trained on a combined task that first required
them to categorize stimuli (rectangles) according to size or brightness
or both and then to discriminate stimuli from the same set in a
same-different design.

The study found evidence for acquired distinctiveness, i.e., cues (size
and brightness, resp.) that were task-relevant became perceptually
salient even during other tasks. The task-relevant interval of the
task-relevant dimension became selectively sensitized, i.e.,
discrimination of stimuli in this range improved (local sensitization at
the class boundary, the classical Categorical Perception effect), but
dimension-wide sensitization was, to a lesser degree, also found (global
sensitization). Less sensitization occurred when subjects had to
categorize according to size and brightness, indicating competition
between those dimensions.

4.1 Categorical Perception in CBF 

The CBF scheme suggests a simple explanation for category-related
influences on perception: When confronted with two stimuli differing
along the stimulus dimension relevant for categorization, the different
respective activation levels of the categorization unit provide
additional information to base the discrimination on, and thus
discrimination across the category boundary is facilitated, as compared
to the case where no categorization network has been trained. Fig. 4
illustrates this idea: The (continuous) output of the categorization
unit(s) provides additional input to the discrimination network in a
discrimination task. In a categorization task, the output of the
category unit is thresholded to arrive at a binary decision, as is the
output of the discrimination network in a yes/no discrimination task.

In particular, global sensitization would be expected as a side effect
of training the categorization unit if its response is not constant
within the classes, which is just what was observed in the simulations
shown above (Fig. 3): The “catness” response level of the categorization
unit decreases as stimuli are morphed from the cat prototypes to cats at
the class boundary and beyond. Its output is then thresholded to arrive
at the categorization rule, which determines the class by the sign of
the response (cf. above). Local sensitization (Categorical Perception)
occurs as a result of a stronger response difference of the
categorization unit for stimulus pairs crossing the class boundary than
for pairs where both members belong to the same class.

In agreement with the experiment by Goldstone [7], we would expect
competition between different dimensions in CBF when class boundaries
run along more than one dimension (e.g., two, as in the experiment), as
compared to a class boundary along one dimension only: For the same
physical change in one stimulus property (one dimension), the response
of the categorization unit should change more in the one-dimensional
than in the two-dimensional case since in the latter case crossing the
class boundary requires change of the input in both dimensions.

Figure 4: Sketch of the model to explain the influence of experience with categorization tasks on object discrimination, leading to global
and local (Categorical Perception) sensitization. Key is the input of the category-tuned unit(s) to the discrimination network (which is shown
here for illustrative purposes as receiving input from the SSCU layer, but this is just one of several alternatives), shown by the thick horizontal
arrow.

4.2 Categorization with and without Categorical Perception

Bülthoff et al. have recently reported [2] that discrimination between
faces is not better near the male/female boundary, i.e., they did not
find evidence for CP in their study, even though subjects could clearly
categorize face images by gender.

Such categorization without CP can be understood within CBF: Following
the simulations described above, CP in CBF is expected if the response
of the category unit shows a stronger drop across the class boundary
than within a class, for the same distance in morph space. Suppose now
the slope of the categorization unit’s response is uniform across the
stimulus space, from the prototypical exemplars for one class (e.g., the
“masculine men”) to the prototypical exemplars of the other class (e.g.,
the “feminine women”). If the subject is forced to make a category
decision, e.g., using the sign of the category unit’s response, as
above, the stimulus ensemble would be clearly divided into two classes
(noise in the category unit’s response would lead to a smoothed out
sigmoidal categorization curve). However, in a discrimination task, the
difference of response values of the category unit for two stimuli
across the boundary would not be different from the difference for two
stimuli within the same class (if the within-pair distance for both
pairs with respect to the category-relevant dimension is the same).
Hence, no Categorical Perception, or, more precisely, no local
sensitization would be expected.
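
The argument can be made concrete with two idealized response profiles along a morph line. Both profiles below are hypothetical (a straight line and a tanh curve with an arbitrarily chosen steepness); they are not fits to the simulated category unit and serve only to show that the sign rule yields the same categorization in both cases while the across-boundary advantage appears only for the steep profile.

```python
import math


def linear_response(x):
    """Uniform slope from +1 at position 0 to -1 at position 1."""
    return 1.0 - 2.0 * x


def steep_response(x):
    """Hypothetical profile with a sharp drop at the boundary (x = 0.5)."""
    return math.tanh(8.0 * (0.5 - x))


def pair_difference(response, a, b):
    """Absolute response difference for stimuli at positions a and b."""
    return abs(response(a) - response(b))


# Same within-pair distance (0.2) across the boundary and within a class.
lin_across = pair_difference(linear_response, 0.4, 0.6)
lin_within = pair_difference(linear_response, 0.1, 0.3)
steep_across = pair_difference(steep_response, 0.4, 0.6)
steep_within = pair_difference(steep_response, 0.1, 0.3)
```

For the linear profile the two differences coincide, so the category unit adds no extra discriminability at the boundary; for the steep profile the across-boundary difference is much larger than the within-class one.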

In CBF, the slope of a category unit’s response curve is influenced by
the extent of the training set with respect to the class boundary. To
demonstrate this, we trained a cat/dog category unit as described above
using four training sets differing in how close the representatives of
each class were allowed to get to the class boundary (which was again
defined by an equality in the sum over the cat and dog coefficients).
Introducing the “crossbreed coefficient”, c, of a stimulus belonging to
a certain class (cat or dog) as the coefficient sum of its corresponding
vector in morph space over all prototypes of the other class (dog or
cat, resp.), training sets differed in the maximum value of c, ranging
from 0.1 to 0.4 in steps of 0.1 (c values of stimuli in each training
set were chosen uniformly within the permissible interval, and training
sets contained an equal number of stimuli, i.e., 200). The first case,
c = 0.1, thus contained stimuli that were very close to the prototypical
representatives of each class, whereas the c = 0.4 set contained cats
with strong dog components and dogs with strong cat components, resp.
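
The crossbreed coefficient follows directly from a stimulus's morph vector. In the sketch below the ordering of the six prototypes (cats first, then dogs), the function names, and the example vector are assumptions introduced for illustration.

```python
import numpy as np

CATS = [0, 1, 2]   # assumed indices of the cat prototypes in the morph vector
DOGS = [3, 4, 5]   # assumed indices of the dog prototypes


def classify(morph):
    """Class by the larger coefficient sum; None exactly on the boundary."""
    cat_sum, dog_sum = morph[CATS].sum(), morph[DOGS].sum()
    if cat_sum == dog_sum:
        return None
    return "cat" if cat_sum > dog_sum else "dog"


def crossbreed_coefficient(morph):
    """Coefficient sum over the prototypes of the other class."""
    label = classify(morph)
    if label is None:
        raise ValueError("stimulus lies on the class boundary")
    other = DOGS if label == "cat" else CATS
    return float(morph[other].sum())


# Hypothetical cat stimulus with a 30% contribution from one dog prototype.
morph = np.array([0.5, 0.2, 0.0, 0.3, 0.0, 0.0])
c = crossbreed_coefficient(morph)
in_c01_set = c <= 0.1   # eligible for the training set with maximum c = 0.1?
in_c04_set = c <= 0.4   # eligible for the training set with maximum c = 0.4?
```

This example stimulus would be admissible in the c = 0.4 training set but not in the c = 0.1 set.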

Figure 5: Average responses over all morph lines for the two networks
(parameters as in Fig. 3) trained on data sets with c = 0.1 and
c = 0.4, respectively. The legend shows in parentheses the performance
(on the training set and on the test set, resp.); the number after the
colon shows the average change of response across the morph line
(absolute value of response difference at positions 0.4 and 0.6)
divided by the response difference for that morph line averaged over
all other stimulus pairs 0.2 units apart.

Fig. 5 shows how the average response along the morph lines differs for
the two cases c = 0.1 and c = 0.4. The legend shows in parentheses the
performance on the training set and on the test set, resp.; the number
after the colon shows the average change of response across the morph
line (absolute value of response difference at positions 0.4 and 0.6)
relative to the response difference for that morph line averaged over
all other stimulus pairs 0.2 units apart. While categorization
performance in both cases is very similar (93% vs. 94% correct on the
test set), the relative change across the class border is much greater
for the c = 0.4 case than in the c = 0.1 case, where the response drops
almost linearly from position 0.2 to position 0.9 on the morph line
(incidentally, the relative drop of 1.6 in the c = 0.4 case is very
similar to the drop observed in prefrontal cortical neurons of a monkey
trained on the same task [6] with the same maximum c value).
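
For a single morph line, the relative change described above could be computed as in the following sketch. It assumes responses are indexed by an integer k standing for position k/10, with the midpoint (k = 5) absent as in the test set; the function name and both example profiles are hypothetical, and averaging the result over morph lines is left out.

```python
import math


def relative_boundary_change(responses):
    """Response change across 0.4/0.6 relative to all other 0.2-apart pairs.

    `responses` maps an integer k (position k/10) to the unit's response.
    """
    across = abs(responses[4] - responses[6])
    others = [
        abs(responses[k] - responses[k + 2])
        for k in range(9)
        if k != 4 and k in responses and (k + 2) in responses
    ]
    return across / (sum(others) / len(others))


positions = [k for k in range(11) if k != 5]  # no stimulus at the midpoint

# Hypothetical linear profile: no extra drop at the boundary.
linear = {k: 1.0 - 0.2 * k for k in positions}
ratio_linear = relative_boundary_change(linear)

# Hypothetical sigmoidal profile: most of the drop happens at the boundary.
steep = {k: math.tanh(0.8 * (5 - k)) for k in positions}
ratio_steep = relative_boundary_change(steep)
```

A ratio near 1 corresponds to the uniform-slope case without local sensitization, while larger values indicate a disproportionate drop across the class boundary.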

Thus, CBF predicts that the amount of categorical perception is related
to the extent of the training set with respect to the class boundary: If
the training set for a categorization task is sparse around the class
boundary (as is the case for face gender classification where usually
most of the training exemplars clearly belong to one or the other
category with a comparatively lower number of androgynous faces), a
lower degree of CP would be expected than in the case of a training set
that extends to the class boundary.

It will be interesting to test this hypothesis experimentally by
training subjects on a categorization task where different groups of
subjects are exposed to subsets of the stimulus space differing in how
close the training stimuli come to the boundary. Category judgment can
then be tested for (randomly chosen) stimuli lying on lines in morph
space passing through the class boundary. In a second step, subjects
would be switched to a discrimination task to look for evidence of CP.
The prediction would be that while subjects in all groups would divide
the stimulus space into categories (not necessarily in the same way or
with the same degree of certainty, as there would be uncertainty
regarding the exact location of the class boundary that increases for
groups that were only trained on stimuli far away from the boundary),
the degree of CP should increase with the closeness of the training
stimuli to the true class boundary. Naturally, the categorization scheme
used in this task should be novel for the subjects to avoid confounding
influences of prior experience. Hence, a possible alternative to the
cat/dog categorization task described above would be to group car
prototypes (randomly) into two classes and then train subjects on this
categorization task.

One issue to be addressed is whether the fact that subjects are trained
on different stimulus sets will influence discrimination performance
(even in the absence of any categorization task). For the present case,
simulations indicate only a small effect of the different training sets
on discrimination performance (see Fig. 6), but it is unclear whether
this transfers to other stimulus sets. However, while the different
training groups might differ in their performance on the untrained part
of the stimulus space due to the different SSCUs learned, the prediction
is still that the area of improved discriminability should coincide with
the subjects’ location of the class boundary rather than with the extent
of the training set. To avoid range and anchor effects [3] (see footnote
below), stimuli should be chosen from a continuum in morph space, e.g.,
a loop.

Why has no CP been found for gender classification while other studies
have found evidence for CP in emotion classification using line drawings
[5] as well as photographic images of faces [3]?* For the case of
emotions, subjects are likely to have had experience with not just the
“prototypical” facial expression of an emotion but also with varying
combinations and degrees of expressions and have learned to categorize
them appropriately, corresponding to the case of high c values in the
cat/dog case described above, where CP would be expected.

5 COP or CBF? Suggestion for Experimental Tests

It appears straightforward to design a physiological experiment to
elucidate whether COP or CBF better models actual category learning: A
monkey is trained on two different categorization tasks using the same
stimuli (for example, the cat/dog stimuli used in the simulations
above). The responses of prefrontal cortical neurons (which have been
shown in a preliminary study using these stimuli [6] to carry category
information) to the test stimuli are then recorded while the monkey is
passively viewing the test stimuli (e.g., during a fixation task). In
CBF, we would expect to find neurons showing tuning to either
categorization scheme, whereas COP would predict that cell tuning
reflects a single metric in shape space. In the former case, it will be
interesting to compare neural responses to the same stimuli while the
monkey is performing the two different categorization tasks to look at
response enhancement/suppression of neurons involved in the different
categorization tasks.

*CP has also been claimed to occur for facial identity [1], but the
experimental design appears flawed as stimuli in the middle of the
continuum were presented more often than the ones at the extremes, and
prototypes were easily extracted from the discrimination task, biasing
subjects’ discrimination responses towards the middle of the continuum
[11].

[Figure 6 panel title (right panel): (c=0.1) - (c=0.4)]

Figure 6: Comparison of Euclidean distances of activation patterns (over 144 SSCUs, as used in the previous simulations) for stimuli lying at
two different positions on morph lines for the cases of c = 0.1 and c = 0.4. The left panel shows the average Euclidean distance between the
activity pattern for a stimulus at position n1 (y-axis) and a stimulus on the same morph line at position n2 (x-axis), for the network trained on
the data set with c = 0.1 (note that there were no stimuli at the 0.5 position). The middle panel shows the corresponding plot for the network
trained on c = 0.4, while the right panel shows the difference between the two plots: Differences between the two networks are usually quite
low in magnitude (note the different scaling on the z-axes), suggesting that discrimination performance in the c = 0.1 case should be close to
the c = 0.4 case.

6 Conclusions 

We have described a novel model of object representation that is based
on the concurrent use of different categorization schemes using
arbitrary class definitions. This scheme provides a more natural basis
for classification than the “Chorus of Prototypes” with its notion of
one global shape space. In our framework, called “Categorical Basis
Functions” (CBF), the stimulus space is represented by units whose
receptive fields self-organize without regard to any class boundary. In
a second, supervised stage, categorization units receiving input from
the stimulus space-covering units (SSCUs) come to learn different
categorization task(s). Note that this just describes the basic
framework; one could imagine, for instance, the addition of slow
time-scale top-down feedback to the SSCU layer, analogous to the GRBF
networks of Poggio and Girosi [10], that could enhance categorization
performance by optimizing the receptive fields of SSCUs. Similarly, the
algorithms used to learn SSCUs (k-means clustering or simple storage of
all training examples) and the categorization units (RBF) should just be
taken as examples. For instance, (a less biological version of) CBF
could also be implemented using Support Vector Machines [14]. In this
case, a categorization unit would only be connected to a sparse subset
of SSCUs, paralleling the sparse connectivity observed in cortex.

A final note concerns the advantages of CBF for the learning and
representation of class hierarchies: While the simulations presented in
this paper limited themselves to one level of categorization, it is
easily possible to add additional layers of sub- or superordinate level
units receiving inputs from other categorization units. For instance, a
unit learning to classify a certain breed of dog could receive input not
only from the SSCUs but also from a “generic dog” unit, or a “quadruped”
unit could be trained receiving inputs from units selective for
different classes of four-legged animals, in both cases greatly
simplifying the overall learning task.
Appendix: Parameter Dependence of Categorization Performance for the Cat/Dog Task

[Figure 7 panel titles: n=144 a=40 sig=0.2 (100,66): 0.48; n=144 a=40 sig=0.7 (100,58): 0.85; n=144 a=256 sig=0.2 (100,94): 1; n=144 a=256 sig=0.7 (100,97): 2.1]

Figure 7: Output of the categorization unit trained on the cat/dog
categorization task from section 3.1, for 144 SSCUs (where each
SSCU was centered at a training example) and two different values
for the σ of the SSCU and the number of afferents to each SSCU
(choosing either all 256 C2 units or just the 40 strongest afferents,
cf. [12]). The numbers in parentheses in each plot title refer to the
unit's categorization performance on the training and on the test set,
resp. The number on the right-hand side is the average response
drop over the category boundary relative to the average drop over
the same distance in morph space within each class (cf. section 4.2).
Note the poor performance on the test set for a low number of
afferents to each unit, which is due to overtraining. The plot in the
lower right shows the unit from Fig. 3.

[Figure 8 panel titles: n=30 a=40 sig=0.2 (98,91): 0.78; n=30 a=256 sig=0.2 (99,91): 0.77]

Figure 8: Same as the above figure, but for a SSCU representation
based on just 30 units, chosen by a k-means algorithm from the 144
centers in the previous example.